Analyzing the Performance of Multilayer Neural Networks for Object Recognition
In the last two years, convolutional neural networks (CNNs) have achieved an
impressive suite of results on standard recognition datasets and tasks.
CNN-based features seem poised to quickly replace engineered representations,
such as SIFT and HOG. However, compared to SIFT and HOG, we understand much
less about the nature of the features learned by large CNNs. In this paper, we
experimentally probe several aspects of CNN feature learning in an attempt to
help practitioners gain useful, evidence-backed intuitions about how to apply
CNNs to computer vision problems. Comment: Published in European Conference on Computer Vision 2014 (ECCV-2014)
Contractive De-noising Auto-encoder
An auto-encoder is a neural network trained to reconstruct its input.
The de-noising auto-encoder (DAE) is an improved auto-encoder that gains
robustness to the input by first corrupting the original data and then
reconstructing the original input by minimizing a reconstruction error
function. The contractive auto-encoder (CAE) is another improved auto-encoder
that learns robust features by penalizing the Frobenius norm of the Jacobian
matrix of the learned features with respect to the original input. In this
paper, we combine the de-noising auto-encoder and the contractive auto-encoder
and propose another improved auto-encoder, the contractive de-noising
auto-encoder (CDAE), which is robust to both the original input and the
learned features. We stack CDAEs to extract more abstract features and apply
an SVM for classification. Experimental results on the MNIST benchmark dataset
show that the proposed CDAE performs better than both DAE and CAE,
demonstrating the effectiveness of our method. Comment: Figures edited
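As a rough illustration of the two ingredients named in the abstract above, the sketch below computes the contractive term (the squared Frobenius norm of the Jacobian of the hidden features with respect to the input) for a single sigmoid encoder layer, together with a simple input-corruption step. The sigmoid encoder, the masking noise, the function names and the toy weights are all assumptions made for illustration; the abstract does not say how the paper parameterizes the encoder or how it weights the two terms.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(W, b, x):
    """Hidden activations h = sigmoid(W x + b) for one input vector x."""
    return [sigmoid(sum(w * x_i for w, x_i in zip(row, x)) + b_j)
            for row, b_j in zip(W, b)]

def contractive_penalty(W, b, x):
    """Squared Frobenius norm of dh/dx for a sigmoid encoder (assumed form).

    Since dh_j/dx_i = h_j * (1 - h_j) * W[j][i], the squared norm is
    sum_j (h_j * (1 - h_j))**2 * sum_i W[j][i]**2.
    """
    h = encode(W, b, x)
    return sum((h_j * (1.0 - h_j)) ** 2 * sum(w * w for w in row)
               for h_j, row in zip(h, W))

def corrupt(x, drop_prob, rng):
    """Masking noise (an assumed corruption): zero each input with drop_prob."""
    return [0.0 if rng.random() < drop_prob else x_i for x_i in x]

if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 2.0]]   # hypothetical 2x2 encoder weights
    b = [0.0, 0.0]
    x = [0.3, -0.7]
    noisy_x = corrupt(x, drop_prob=0.25, rng=random.Random(0))
    print(contractive_penalty(W, b, noisy_x))
```

A training loss in this spirit would add a weighted `contractive_penalty` to the reconstruction error measured on the corrupted input; the weighting is a free hyperparameter here, not a value taken from the paper.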
Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks
In this paper we propose and investigate a novel nonlinear unit, called the
$L_p$ unit, for deep neural networks. The proposed $L_p$ unit receives signals
from several projections of a subset of units in the layer below and computes a
normalized $L_p$ norm. We notice two interesting interpretations of the $L_p$
unit. First, the proposed unit can be understood as a generalization of a
number of conventional pooling operators such as average, root-mean-square and
max pooling widely used in, for instance, convolutional neural networks (CNN),
HMAX models and neocognitrons. Furthermore, the $L_p$ unit is, to a certain
degree, similar to the recently proposed maxout unit (Goodfellow et al., 2013)
which achieved the state-of-the-art object recognition results on a number of
benchmark datasets. Secondly, we provide a geometrical interpretation of the
activation function based on which we argue that the $L_p$ unit is more
efficient at representing complex, nonlinear separating boundaries. Each $L_p$
unit defines a superelliptic boundary, with its exact shape defined by the
order $p$. We claim that this makes it possible to model arbitrarily shaped,
curved boundaries more efficiently by combining a few $L_p$ units of different
orders. This insight justifies the need for learning different orders for each
unit in the model. We empirically evaluate the proposed $L_p$ units on a number
of datasets and show that multilayer perceptrons (MLP) consisting of the $L_p$
units achieve the state-of-the-art results on a number of benchmark datasets.
Furthermore, we evaluate the proposed $L_p$ unit on the recently proposed deep
recurrent neural networks (RNN). Comment: ECML/PKDD 201
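To make the pooling interpretation in the abstract above concrete, here is a minimal sketch of a normalized norm of order p (an L_p norm) applied to a handful of already-projected inputs. It assumes the simplest form, (mean of |z_i|^p)^(1/p), with a fixed order p >= 1; the learned projections and the way the paper learns the order are left out, and the function name is hypothetical.

```python
def lp_unit(z, p):
    """Normalized L_p norm of projected inputs z: (mean(|z_i|**p)) ** (1/p).

    Assumes p >= 1. p = 1 gives the mean absolute value, p = 2 the
    root-mean-square, and large p approaches the maximum absolute value.
    """
    if p < 1:
        raise ValueError("order p is assumed to be >= 1")
    if not z:
        raise ValueError("z must contain at least one value")
    return (sum(abs(z_i) ** p for z_i in z) / len(z)) ** (1.0 / p)

if __name__ == "__main__":
    z = [3.0, -4.0]                 # toy projected inputs
    for order in (1, 2, 8, 200):
        print(order, lp_unit(z, order))
```

Running the loop shows the output moving from average-like pooling toward max-like pooling as the order grows, which is the generalization the abstract describes.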
3D freeform surfaces from planar sketches using neural networks
A novel intelligent approach to 3D freeform surface reconstruction from planar sketches is proposed. A multilayer perceptron (MLP) neural network is employed to induce 3D freeform surfaces from planar freehand curves. Planar curves were used to represent the boundaries of a freeform surface patch. The curves were varied iteratively and sampled to produce data for training and testing the neural network. The results demonstrate that the network successfully learned the inverse-projection map and correctly inferred the respective surfaces from fresh curves.
Design of a General-Purpose MIMO Predictor with Neural Networks
A new multi-step predictor for multiple-input, multiple-output (MIMO) systems is proposed. The output prediction of such a system is represented as a mapping from its historical data and future inputs to future outputs. A neural network is designed to learn the mapping without requiring a priori knowledge of the parameters and structure of the system. The major problem in developing such a predictor is how to train the neural network. In the case of the back-propagation algorithm, the network is trained using its output error, which is not known because the predicted future system outputs are unknown. To overcome this problem, the concept of updating, instead of training, a neural network is introduced and verified with simulations. The predictor then uses only the system's historical data to update the configuration of the neural network and always works in a closed loop. If each node can only handle scalar operations, emulation of a MIMO mapping requires the neural network to be excessively large, and it is difficult to specify some known coupling effects of the predicted system. So, we propose a vector-structured, multilayer perceptron for the predictor design. MIMO linear, nonlinear, time-invariant, and time-varying systems are tested via simulation, and all showed very promising performance.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/68861/2/10.1177_1045389X9400500206.pd
A Neural Networks Committee for the Contextual Bandit Problem
This paper presents a new contextual bandit algorithm, NeuralBandit, which
does not require any stationarity assumption on contexts and rewards. Several
neural networks are trained to model the value of rewards given the
context. Two variants, based on a multi-expert approach, are proposed to choose
the parameters of the multi-layer perceptrons online. The proposed algorithms are
successfully tested on a large dataset with and without stationarity of
rewards. Comment: 21st International Conference on Neural Information Processing
Recurrent Latent Variable Networks for Session-Based Recommendation
In this work, we attempt to ameliorate the impact of data sparsity in the
context of session-based recommendation. Specifically, we seek to devise a
machine learning mechanism capable of extracting subtle and complex underlying
temporal dynamics in the observed session data, so as to inform the
recommendation algorithm. To this end, we improve upon systems that utilize
deep learning techniques with recurrently connected units; we do so by adopting
concepts from the field of Bayesian statistics, namely variational inference.
Our proposed approach consists in treating the network recurrent units as
stochastic latent variables with a prior distribution imposed over them. On
this basis, we proceed to infer corresponding posteriors; these can be used for
prediction and recommendation generation, in a way that accounts for the
uncertainty in the available sparse training data. To allow for our approach to
easily scale to large real-world datasets, we perform inference under an
approximate amortized variational inference (AVI) setup, whereby the learned
posteriors are parameterized via (conventional) neural networks. We perform an
extensive experimental evaluation of our approach using challenging benchmark
datasets, and illustrate its superiority over existing state-of-the-art
techniques
Explicit Computation of Input Weights in Extreme Learning Machines
We present a closed form expression for initializing the input weights in a
multi-layer perceptron, which can be used as the first step in synthesis of an
Extreme Learning Machine. The expression is based on the standard function for
a separating hyperplane as computed in multilayer perceptrons and linear
Support Vector Machines; that is, as a linear combination of input data
samples. In the absence of supervised training for the input weights, random
linear combinations of training data samples are used to project the input data
to a higher dimensional hidden layer. The hidden layer weights are solved in
the standard ELM fashion by computing the pseudoinverse of the hidden layer
outputs and multiplying by the desired output values. All weights for this
method can be computed in a single pass, and the resulting networks are more
accurate and more consistent on some standard problems than regular ELM
networks of the same size. Comment: In submission for the ELM 2014 Conference
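The single-pass computation described in the abstract above can be written out roughly as follows. The notation is assumed for illustration and is not taken from the paper: the rows of X (N x d) are the training samples, T (N x k) holds the desired outputs, R is a random M x N matrix that forms M random linear combinations of the training samples, g is the hidden-layer activation, and H^+ is the Moore-Penrose pseudoinverse. Bias terms and any scaling the paper may apply are omitted.

```latex
\begin{aligned}
W &= R\,X, & W &\in \mathbb{R}^{M \times d}
  && \text{(input weights: random combinations of samples)}\\
H &= g\!\left(X W^{\top}\right), & H &\in \mathbb{R}^{N \times M}
  && \text{(hidden-layer outputs on the training set)}\\
\beta &= H^{+}\,T, & \beta &\in \mathbb{R}^{M \times k}
  && \text{(output weights via the pseudoinverse)}\\
\hat{y}(x) &= g\!\left(x^{\top} W^{\top}\right)\beta
  &&&& \text{(prediction for a new input } x \in \mathbb{R}^{d}\text{)}
\end{aligned}
```

Under these assumptions no iterative training is involved: the only learned quantity is the output weight matrix, obtained from one least-squares solve.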
Soft Computing Models for the Development of Commercial Conversational Agents
Proceedings of: 6th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2011). Salamanca, April 6-8, 2011.
In this paper we present a proposal for the development of conversational agents that, on the one hand, takes into account the benefits of using standards like VoiceXML, whilst on the other, includes a module with a soft computing model that avoids the effort of manually defining the dialog strategy. This module is trained using a labeled dialog corpus, and selects the next system response through a neural-network-based classification process that takes the dialog history into account. Thus, system developers only need to define a set of VoiceXML files, each including a system prompt and the associated grammar to recognize the users' responses to the prompt. We have applied this technique to develop a conversational agent in VoiceXML that provides railway information in Spanish.
Funded by projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485), and DPS2008-07029-C02-02. Publicad
Fewer epistemological challenges for connectionism
Seventeen years ago, John McCarthy wrote the note 'Epistemological challenges for connectionism' as a response to Paul Smolensky’s paper 'On the proper treatment of connectionism'. I will discuss the extent to which the four key challenges put forward by McCarthy have been solved, and what new challenges lie ahead. I argue that there are fewer epistemological challenges for connectionism, but progress has been slow. Nevertheless, there is now strong indication that neural-symbolic integration can provide effective systems of expressive reasoning and robust learning due to the recent developments in the field.